<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <title>Eclipse 3.0 builder issues</title>
  <link rel="stylesheet" href="http://dev.eclipse.org/default_style.css" type="text/css">
</head>
<body text="#000000" bgcolor="#FFFFFF">
 
<h1>Eclipse 3.0 builder issues</h1>
<font size="-1">Last modified: November 13, 2003</font> 
<p>
This document outlines the current pressing builder
problems, maps out the solution space, and will eventually propose concrete changes
for Eclipse 3.0 and if necessary for a 2.1.x maintenance release. Since this is not
a committed plan item, no promises are being made at this point.
</p>
<h3>Problems</h3>
<h4>P1. Dynamic classpaths, static project references</h4>
<p>
There are many different PDE self-hosting styles.  Some users build and run against
plugins outside the workspace, some use binary or source projects in the workspace,
and some use a combination of the two.  Every time a user switches from building
against external libraries to building against internal projects, the classpaths of all
downstream plugins need to be updated.  When the set of referenced projects on the
classpath change, so do the platform project references.  These references are used
by the platform when computing the default project build order.
</p>
<p>
Before the Eclipse 2.1 release, JDT introduced the notion of classpath containers.
Classpath containers abstract away volatile sets of references so that the .classpath
file does not need to change every time these reference sets change.  The two common
uses of containers are for JRE libraries, and PDE project references.  This approach 
solves the problem of .classpath files constantly changing as the user's self-hosting
style changes (or when the JRE changes).  The remaining problem is that the project 
references, manifested in the .project file, do not have a corresponding dynamic 
container facility.  Therefore the .project file continues to change whenever the 
user switches between external and internal project references.
</p>
<h4>P2. Builders need more context</h4>
<p>
Individual project builders are not supplied with enough context information when
they are invoked.  In particular, they do not know whether the current build is a single
project build or a workspace build.  Also, there is no consistent hook that builder
implementors can use to be notified when the build process is starting and finishing.
If this information was available, builder implementors would have more flexibility
in optimizing multiple project builds.
</p>
<p>
Without going into too much detail, consider a simple case of projects A and B,
where B depends on A and so must be built after it.  When A is built, it could leave
behind a &quot;cache&quot; that describes in more detail exactly what changes builder A made.
Builder B could then read the cached information and optimize its build accordingly.  However,
with manual project builds, project A could be built several times before project B is
built.  In this simple case, builder A could incrementally update this cache until
B is finally built.  However, if there is also a project C that depends on A, it could be
built at a different time.  In the general case, it is impossible for a builder to leave
behind a cache for dependent builders to read, because it can never know the appropriate
times to update or flush that cache.
</p>
<p>
To further complicate matters, other builders may have been installed on projects A or B
that run in between the two builders that are trying to optimize downstream dependency
changes.  Builders cannot reliably determine if other builders have been run that affect
them, or whether changes they see in a delta have been caused by their own builder
or some other builder.
</p>
<h4>P3. Build scopes</h4>
<p>
Some Eclipse users work with massive workspaces of highly interconnected resources.
These workspaces often take a very long time to build when a resource with a large
number of downstream references changes.  Builder implementors are able to optimize
these cases to some extent, but in cases where thousands of referenced resources are
genuinely affected by a change, these optimizations can only go so far.
</p>
<p>
One coping strategy that users have adopted is to only build small subsets of
the workspace frequently, and perform full workspace builds less often.  This essentially
batches changes so that a single long build can replace frequent long builds, thus
freeing up more time for users.
</p>
<p>
The current Eclipse build facilities are not well suited to this approach.  Autobuild
and the incremental build command (invoked with Ctrl+B), work only at a workspace
scope.  To build smaller sets of projects, users have to perform cumbersome manual
steps involving multi-selecting the set of projects, and invoking &quot;Build Project&quot;
from the &quot;Project&quot; context menu.  Autobuild has established a high
expectation about the transparency of building, making these manual steps unacceptable.
</p>

<h3>Solutions</h3>
<h4>S1. Workspace builders</h4>
<p>
Currently a workspace build runs roughly as follows:
<pre>
   for each project P(i) in workspace {
      for each builder B(j) installed on P(i) {
         run builder P(i)B(j)
      }
   }
</pre>
This process could be inverted by allowing builders to be installed on the entire workspace
that have complete control over how the projects they care about are built.  The build
process would then run as follows:
<pre>
   for each workspace builder B(i) {
      run builder B(i)
   }
</pre>
And a workspace builder B(i) would have a loop within its build like this:
<pre>
   for each project P(j) in workspace {
      if (P(j) has nature for builder B(i) {
         build project builder B(i) on project P(j)
      }
   }
</pre>
</p>
<p>
The order in which builders run would be established by declarative rules in the
workspace builder extension point.  These rules would be resolved by platform
as a topological graph ordering with cycles treated as unrecoverable errors.
</p>
<p>
<b>Pros:</b> With this approach, each builder has complete control over the build order it uses.
Project references are no longer needed, so problem P1 goes away.  The builder
has all the context information it needs, so the optimizations discussed in problem P2
are now possible. It is also no longer possible for other builders to run interleaved
with a particular builder as it iterates over the projects.  Thus the builder has reasonably
complete knowledge of what changes have happened to each project, and what
the content of those changes are.
</p>
<p>
<b>Cons:</b> This is a breaking change to the workspace build order.  If project
P1 and P2 have builders B1 and B2, the old order is:
<pre>
P1B1, P1B2, P2B1, P2B2
</pre>
And the new order is:
<pre>
P1B1, P2B1, P1B2, P2B2
</pre>
If there are projects relying on this order, they will be broken by this change.  It is
true that since the platform has always exposed API to run an individual builder on
a single project, that all possible build orders have always been permitted and all 
builders should be able to handle it.  However, this is a breaking change to 
autobuild and to the behaviour of the API method <tt>IWorkspace.build</tt>.
</p>
<p>
Second, there is a large class of builders that piggyback on other builders, and rely
on those builders to establish the build order for them.  For example, a client may
install a builder that acts as a pre- or post-processor on the java builder, generating
extra source files or copying/mutating class files.  These builders typically have no
knowledge of the correct build project order.  In a world of workspace builders, the
workspace builder for these secondary builders would need a mechanism for computing
a reasonable project build order.  The java builder and other &quot;primary&quot;
builders would need to expose such an API and all secondary builders would use this
(non generic) API to compute a reasonable build order.
</p>
<p>
In the general case, it would be impossible to provide a backwards compatibility mode
for builders implemented against Eclipse 2.1.  For example, say a 2.1 workspace is
opened in 3.0, and it has projects with builder order {B1, B2}.  Say B2 is now a 
workspace builder that defines its own project build order.  B2 was previously using
project references, but it immediately removes all project references when its plugin
is first loaded since these are no longer used (if it did not do this it would not be
solving problem P1 in our list).  How does the platform determine an appropriate 
project order for running builder B1?  Since B2 has not run, it can not infer an
order from B2.  B2 would have to expose some method that pre-declares the order
it intends to use.  B1, in the absence of its own build order, would use this declared
order for its own projects.  Now say there are two workspace builders (B2 and B3) on 
each project, and each declares a different project order.  It is now impossible to
infer an appropriate build order for B1.
</p>
<h4>S2. Pluggable build strategy</h4>
<a name="s2"/>
<p>
A less ambitious proposal is to allow pluggable build strategies to determine
build order, but then build each project in the same manner as Eclipse 2.1. A
build strategy extension point would allow plugins to provide a subclass of
BuildStrategy:
<pre>
public abstract class BuildStrategy {
	public abstract void build(IProject[] projects, IProgressMonitor monitor);
	public final void build(IProject project, IProgressMonitor monitor) {}
	public final int getKind() {...}
	... other methods to provide additional context if needed ... 
}
</pre>
</p>
<p>
A build strategy is a first class participant in the build progress.  Each installed
strategy is invoked for every single build (workspace, project, incremental, 
full, autobuild, etc).  A strategy instance is given an array of projects to build,
and it must call BuildStrategy.build(IProject, IProgressMonitor) on the projects that it is interested
in building.  This callback will then invoke all the builders on that project in the order 
specified on the project build spec (same as previous Eclipse releases). The strategy 
may build zero or more projects in the array, and it may build the same project 
multiple times (in the case of cycles).
</p>
<p>
If multiple strategies are present, they are invoked in no
particular order. Strategies are only given the &quot;left over&quot; 
projects that no previous strategy chose to build.  Finally, a default platform
strategy runs to build the remaining unbuilt projects, using project references to determine
the order.
</p>
<p>
(Note: I am calling this solution &quot;build strategy&quot; to avoid confusing it with S1.  If we
go with this route, it could easily be renamed to &quot;workspace builder&quot;
since everyone seems to like that term.)
</p>
<p>
<b>Pros:</b> A build strategy can be used as an alternative to project references
for setting the project build order.  Since this mechanism is dynamic, it fits with the
JDT dynamic build path story and avoids modifying the project description file every
time the build path changes (solving problem P1). This hook also gives a build strategy 
implementor all the context information they need, including the complete set of 
projects being built, and an opportunity to add pre- and post-processing on the 
entire build cycle (solving problem P2).
</p>
<p>
This approach is also backwards compatible.  If no builders made any changes, then
the default build strategy would continue to determine build order from project references
as it did in previous releases.  Each builder that was influencing the build order by 
changing project references can now supply a build strategy extension if they want
to compute build order dynamically.
</p>
<p>
<b>Cons:</b> If there are multiple strategies installed, they may only operate
on disjoint sets of projects. This is the same limitation that existed with project
references, since multiple builders setting the project references prior to Eclipse 3.0
would have been overwriting each other's references.
</p>
<p>
This does not solve the final problem discussed in P2, the interleaving of other builders 
between two invocations of a given builder extension.  This problem cannot be
solved without breaking changes to the workspace build order.
</p>
<p>
Another drawback is that this puts responsibility for complex aspects of the build
infrastructure into the hands of strategy implementors.  For example, handling
cycles in the build order, and handling a user specified build order. This increase
in delegated responsibility comes with the usual cost of decreased flexibility
to make future changes to the build infrastructure.
</p>

<h4>S3. Working set build</h4>
<p>
The two build actions that currently surface in the UI are project build and workspace
build.  A third option, working set build, could be introduced for building a subset
of the workspace.  This option would only be visible after entering some advanced
mode (for example from a preference page) to avoid confusion for new users.
This option would only run builders on the specified set of projects, avoiding builds 
of either prerequisite or downstream dependent projects.  This option could be
made available in both autobuild and manual build modes.  While a manual working
set build can be done using current API, it would be beneficial to introduce a new 
core API method for working set builds.
This would ensure correct computation of build order, and allow for participation 
by other proposed mechanisms such as workspace builders or pluggable 
build strategies.
</p>
<p>
This solution aims to address only problem P3 listed above.  It may work in
conjunction with other solutions listed in this document. 
</p>
<p>
<b>Pros:</b> Allows user with large, heavily interconnected workspaces to
build smaller sets of projects more often, and defer building the entire workspace
until a larger number of changes have been made.
</p>
<p>
<b>Cons:</b> Working set autobuild in particular would introduce significant
complexity.  To avoid compatibility problems it would have to be treated as
a disjoint mode from workspace autobuild.  I.e., <code>IWorkspace.isAutoBuilding</code>
would return false when in working set autobuild mode, and some new flag could be
used to find out if the workspace is in working set autobuild mode. This mode would
be difficult to explain to both API clients and users.  Workspace autobuild is simple
to explain: when in this mode, all projects are always in a fully built state.
Working set autobuild is more complex: when in this mode, and when all prerequisites
of projects in the working set are in a built state, then all projects in the working set
will remain in a built state, but all projects outside of the working set will not
be built. This story would be easier to explain if we required that the working set
be internally consistent (not referencing projects outside the working set), but this
assertion is difficult to verify if we introduce a dynamic build order such as 
workspace builders or pluggable build strategies.  Similar difficulties exist with 
manual working set build, but the notion of manual build at a different scope
already exists.
</p>
<h4>S4. Disabling builders on a per-project basis</h4>
<p>
An alternative to working set build is to add new API for disabling build on a
per-project (or even per-builder) basis.  The setting would apply to all forms of builds 
(incremental ,full, auto, project, workspace).  The value would be persisted by the 
platform.  Builds scoped at the workspace level (including autobulid) would simply omit 
projects for which building is disabled.
</p>
<p>
This solution is essentially equivalent in functionality to working set build.  It answers
the question &quot;what not to build?&quot; rather than the question &quot;what
to build?&quot;.  This is perhaps an easier story to explain to users (similar concept
in some ways to project close/open). With working sets there is confusion about
what types of builds it will apply to, in addition to overloading the working set 
concept currently used in the UI for filtering views.
</p>
<p>
If this solution was carefully implemented, it could also solve the long standing
<a href="https://bugs.eclipse.org/bugs/show_bug.cgi?id=21311">
enhancement request</a> to be able to turn off the java builder (in order to replace it with
javac or other compiler).
</p>

<h3>Proposed action</h3>
<p>
TBD
</p>




















































</body>
</html>